Patent abstract:
Image processing method and image processing system for recognizing characters included in an image. A first character recognition unit performs recognition of a first group of characters corresponding to a first region of the image. A measurement unit calculates a confidence measure of the first group of characters. A determination unit determines whether further recognition should be performed based on the confidence measure. A selection unit selects a second region of the image which includes the first region, if it is determined that further recognition is to be performed. A second character recognition unit performs further recognition of a second group of characters corresponding to the second region of the image.
Publication number: BE1026039B1
Application number: E20195125
Filing date: 2019-02-28
Publication date: 2019-12-13
Inventors: Frédéric Collet; Jordi Hautot; Michel Dauw
Applicant: Iris Sa
Primary IPC class:
Patent description:

IMAGE PROCESSING METHOD AND IMAGE PROCESSING SYSTEM
[0001] Technical Field
[0002] The present invention relates to an image processing method and an image processing system for recognizing the characters included in an image. In particular, the present invention relates to the recognition of characters from an image.
State of the art
[0004] Character recognition is carried out to convert text included in an image into machine-coded text. Images that can be analyzed using character recognition software include a scanned document, a photograph of a document, a photograph of a scene, a video recording, and text that has been overlaid on a document. Text in the image that can be converted includes typed, handwritten, and printed text. Machine-coded text includes all character coding standards for electronic communications, such as ASCII, Unicode, and emoji. Character recognition applications include:
- the display to a user of machine-coded characters which correspond to the text included in the image;
- the overlay of the image with the machine-coded characters, so that the text can be selected by the user;
- the provision of a search function for the text included in the image by allowing the search for machine-coded text;
- machine reading, where a computer device interprets the context of the text included in the image;
- entering machine-coded characters that correspond to the text included in the image;
- automatic recognition of number plates; and
- converting handwriting in real time, to enter text into a computer device.
Character recognition software is configured to receive an input image and output machine-coded text. In addition, character recognition software can perform error analysis to determine a confidence measure of the machine-coded text that is output.
The term character recognition designates the identification and recognition of individual characters in the image. However, the term is also used to encompass word recognition, where identification and recognition occur one word at a time. Examples of character recognition include optical character recognition, optical word recognition, intelligent character recognition and intelligent word recognition.
Character recognition is adapted on the basis of the writing system that is included in the document, such as Latin, Cyrillic, Arabic, Hebrew, Indic, Bengali, Devanagari, Tamil, Chinese, Japanese, Korean, Morse code and braille. Character recognition is further adapted on the basis of the language of the text included in the image. The writing system and the language of the text can be identified by the user, or they can be identified by the character recognition software from the context of the characters and words that are recognized. In addition, character recognition can be adapted to process documents that include text in several writing systems or languages.
[0008] Character recognition occurs by associating machine-coded characters with at least one example of a glyph that could be found in an image. The accuracy of character recognition is improved by increasing the number of glyphs that represent a machine-coded character. This is particularly useful for improving the accuracy of recognition of a variety of fonts. Intelligent recognition is achieved by using machine learning to train a computer system that uses a neural network. Intelligent recognition improves recognition of characters that do not match the glyphs stored as examples.
Machine-coded text often contains errors. Errors can be corrected by the user re-reading the machine-coded text. This is a disadvantage for the user, and techniques are therefore available to improve the accuracy of character recognition and to improve error detection. For example, the accuracy of character recognition can be increased if the output is influenced by a lexicon, which is a dictionary of the words that are expected in a document. Error detection can be improved by performing a spell or grammar check to assess the context of machine-coded text.
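By way of illustration only (the patent does not prescribe an implementation), a lexicon-influenced correction of the kind described above can be sketched in a few lines of Python; the lexicon, cutoff and example word are assumptions introduced here:

```python
import difflib

def lexicon_correct(word, lexicon, cutoff=0.75):
    """Snap a recognized word to its closest lexicon entry, if one is close enough."""
    matches = difflib.get_close_matches(word.lower(), lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else word

lexicon = ["recognition", "character", "image", "processing"]
print(lexicon_correct("recogn1tion", lexicon))  # prints "recognition"
```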
Character recognition has been developed which specializes in detection under specific conditions. Character recognition is particularly difficult if the condition of the image is variable, in which case the most appropriate character recognition technique must be carefully selected. For example:
- character recognition software is generally suitable for reading the clean pages of a document which have been scanned by a multifunction device, in which case errors may occur if the image includes a watermark;
- character recognition software can be adapted to read low quality documents, in which case the output characters will be more accurate than those of character recognition software adapted to read high quality documents; and
- automatic recognition of license plates is adapted for reading vehicle license plates, which is further improved to adapt to different weather conditions and to different types of vehicle license plates.
Character recognition software is specialized to improve accuracy. However, character recognition software consumes computing resources, such as processing power. In addition, the computing resources used affect the execution time of the character recognition software. The computing resources consumed depend on the character recognition technique chosen, and they also depend on the quality of the image. Therefore, a compromise must be found between the available computing resources and the desired level of accuracy.
Summary of the invention
[0013] Aspects of the present invention are defined by the independent claims.
According to a first aspect, an image processing method is provided for recognizing characters included in an image, the image processing method comprising the steps of: performing recognition of a first group of characters corresponding to a first region of the image; calculating a confidence measure of the first group of characters; determining whether additional recognition should be performed on the basis of the confidence measure; selecting a second region of the image which includes the first region, if it is determined that the additional recognition must be carried out; and performing the additional recognition of a second group of characters corresponding to the second region of the image.
According to a second aspect, an image processing system is provided for recognizing characters included in an image, the image processing system comprising: a first character recognition unit configured to perform recognition of a first group of characters corresponding to a first region of the image; a measurement unit configured to calculate a confidence measure of the first group of characters; a determination unit configured to determine whether further recognition is to be performed based on the confidence measure; a selection unit configured to select a second region of the image that includes the first region, if it is determined that additional recognition is to be performed; and a second character recognition unit configured to perform additional recognition of a second group of characters corresponding to the second region of the image. The features of the image processing system can be provided by one or more devices.
Optionally, the image processing system includes an image processing device comprising both the first character recognition unit and the second character recognition unit.
Optionally, the image processing system comprises: a first image processing device comprising the first character recognition unit; and a second image processing device comprising the second character recognition unit.
According to a third aspect, a program is provided which, when executed by an image processing system, causes the image processing system to carry out a method according to the first aspect.
According to a fourth aspect, a computer-readable medium is provided which stores a program according to the third aspect.
Advantageously, the best unit of a plurality of character recognition units is used to recognize the characters in the image. Therefore, character recognition is performed using character recognition units which are adapted according to the image. If the image includes a plurality of conditions, the character recognition units are assigned the regions of the image to which they are adapted. The allocation of resources is optimized by reserving computationally intensive character recognition for a region of the image identified as being of low quality.
Optionally, the image processing method performed by the image processing system further comprises the steps of: performing recognition of a plurality of first groups of characters corresponding to a plurality of first regions of the image; calculating a confidence measure for each of the first groups of characters; determining whether further recognition should be performed for each of the first groups of characters based on the corresponding confidence measure; selecting a plurality of second regions of the image which each include the corresponding first region, if it is determined that further recognition is to be performed; and further recognizing a plurality of second groups of characters corresponding to the plurality of second regions of the image. Advantageously, the additional recognition is carried out for a plurality of second regions, and thus a plurality of errors will be corrected.
Optionally, the step of determining whether additional recognition should be carried out comprises selecting a maximum number of first groups of characters, on the basis of the confidence measure for each of the first groups of characters. Advantageously, additional recognition is carried out at most a maximum number of times, so that the available computing resources are allocated appropriately.
Optionally, the recognition of the first group of characters comprises at least one of: a matrix matching, in which the first region is compared to a glyph; and a feature extraction, in which the first region is compared to a plurality of features of a glyph. Matrix matching and feature extraction are techniques that are performed by the first character recognition unit, individually or in combination. Advantageously, there is a synergy between the recognition of the first characters and the additional recognition of the second characters: a small amount of processing is used by the first character recognition unit, so that computing resources can be allocated to error correction.
Optionally, the confidence measure is based on an average weighting for all the characters of the first group of characters. Advantageously, a word is identified for which the confidence measure is low on average over all the characters of the word.
Optionally, the confidence measure is based on a maximum weighting for all the characters of the first group of characters. Advantageously, a word is identified for which the confidence measure is weak for a particular character of the word.
Optionally, it is determined that additional recognition must be carried out if the confidence measure is less than a threshold value. Advantageously, an assessment of the advisability of additional recognition is carried out, so that computing resources are allocated appropriately. Consequently, if a plurality of errors are identified, these errors can be corrected by carrying out a new recognition in order of priority.
Optionally, it is determined that additional recognition must be carried out if the first group of characters corresponds to text in the first region which is identified as having a number of pixels less than a threshold value. Advantageously, a small number of pixels indicates that the character recognition is likely to contain errors. Consequently, the additional recognition can be adapted for the analysis of documents containing characters having a small number of pixels.
Optionally, it is determined that additional recognition must be carried out if the first group of characters corresponds to text in the first region which is identified as having a height less than a threshold value. Advantageously, a low height results in characters having a small number of pixels, which indicates that the character recognition is likely to contain errors. As a result, the additional recognition can be adapted for the analysis of documents containing text of variable height, such as the covers of magazines and newspapers.
Optionally, the additional recognition of the second group of characters is adapted for an image which is of low quality. Advantageously, the accuracy of character recognition is improved by the use of a second character recognition unit suited to the type of image selected.
Optionally, the additional recognition of the second group of characters is adapted to the second region of the image. Advantageously, the accuracy of character recognition is improved by the use of a second character recognition unit adapted to the type of second region which has been selected.
Optionally, the additional recognition of the second group of characters is specialized for a region of a low quality image. The second region can be evaluated to determine its quality level, so that a second character recognition unit is selected which will output a second group of characters for which the confidence measure will be high. Advantageously, the accuracy of character recognition is improved by the use of a second character recognition unit which is suitable for analyzing low quality images.
Optionally, the additional recognition of the second group of characters uses a neural network. The neural network has been trained to recognize a plurality of word strings. Advantageously, the word strings provide the neural network with contextual information, so that the second character recognition unit is adapted to recognize words which are difficult to recognize in isolation.
Optionally, the second region further comprises words which are identified as being adjacent to the first region. Advantageously, the adjacent words provide the context for the first region, so that the confidence measure is expected to be improved, thereby increasing the probability that the error will be corrected.
Optionally, the second region further includes words which are identified as being on the same line of text as the first region. Advantageously, the word on the same line of text as the first region provides context for the first region, so that the confidence measure is expected to be improved, thereby increasing the probability that the error will be corrected.
Optionally, the second region further includes words which are identified as providing context to the first region. Advantageously, a context measure is used to actively identify a second region which will provide context to the first region. Therefore, it is expected that the confidence measure will be improved, which will increase the probability that the error will be corrected.
Brief description of the figures
The embodiments will now be described, by way of example only, with reference to the accompanying drawings, in which:
FIG. 1 is a schematic diagram which illustrates an image processing system for recognizing characters included in an image;
FIG. 2 is a flowchart illustrating an image processing method for recognizing characters included in an image;
FIG. 3A is a diagram illustrating a first region for which the recognition of characters results in a first group of characters, and a second region for which character recognition results in a second group of characters;
FIG. 3B is a diagram illustrating a plurality of first regions for which character recognition results in a plurality of first groups of characters, and a plurality of second regions for which character recognition results in a plurality of second groups of characters;
FIG. 4A gives an example of a first group of characters which is determined to contain an error based on a confidence measure;
FIG. 4B gives an example of a group of characters which includes the first group of characters; and FIG. 4C provides an example of a second group of characters for which the errors have been corrected;
FIG. 5A provides an example of a first region for which it is determined that additional recognition must be carried out on the basis of a confidence measure;
FIG. 5B gives an example of a second region selected by an image processing system, where the second region includes the first region; and FIG. 5C provides an example of a line of text in an image, which identifies the first region and the second region.
Detailed description
Various examples of embodiments, characteristics and aspects of the invention will be described in detail below with reference to the drawings.
Each embodiment of the present invention described below can be implemented independently or as a combination of a plurality of embodiments or features thereof if necessary or when the combination of elements or features of individual embodiments in a single embodiment is beneficial.
FIG. 1 is a schematic diagram which illustrates an image processing system 100 for identifying the text included in an image. The image processing system 100 includes an input 101 and an output 102, a plurality of character recognition units 120, a processor 130 and a memory 140. The image processing system 100 is illustrated as a single image processing device 100 which includes the plurality of character recognition units 120. Alternatively, the image processing system 100 could include a plurality of image processing devices, each having a character recognition unit.
The plurality of character recognition units 120 includes at least a first character recognition unit 121 and a second character recognition unit 122, and may include other character recognition units. Each character recognition unit 120 has the function of identifying the characters in a region of an image and of associating the identified characters with machine-coded text. The characters of the image are identified and recognized from the analysis of the pixels of the region of the image. Characters can be recognized in a selection of languages, in a variety of fonts.
The different character recognition units 120 are adapted so that character recognition is optimized for specific conditions. The quality of the image, the language of the text, the font of the text, whether the text is typed or handwritten, and the available computing resources are examples of specific conditions.
The first character recognition unit 121 is configured to recognize all the text of the image, and in particular a first group of characters corresponding to a first region of the image. The first character recognition unit 121 performs character recognition using conventional techniques to recognize text in the image. Over-segmentation is used to identify the characters in the image. A character identified in the image is compared to a plurality of reference glyphs which are stored in a memory of the image processing system 100. A number of techniques are available for comparing a character identified in the image with the reference glyphs, such as matrix matching and feature extraction. Matrix matching involves comparing the pixel pattern of the identified character with the pixel pattern of the reference glyphs. Feature extraction breaks the input character into features such as lines, closed loops, line direction and line intersections, and these extracted features are then compared to the corresponding features of the reference glyphs.
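A minimal sketch of these two comparison techniques follows (illustrative only; the binary bitmaps, the toy feature set and the combined score are assumptions, not the patent's method):

```python
import numpy as np

def matrix_match_score(candidate, glyph):
    """Matrix matching: fraction of pixels that agree between two binary bitmaps."""
    return np.mean(candidate == glyph)

def extract_features(bitmap):
    """A toy feature vector: ink density plus vertical and horizontal symmetry."""
    return np.array([
        bitmap.mean(),                         # ink density
        np.mean(bitmap == np.flipud(bitmap)),  # vertical symmetry
        np.mean(bitmap == np.fliplr(bitmap)),  # horizontal symmetry
    ])

def feature_match_score(candidate, glyph):
    """Feature extraction: similarity of feature vectors (1.0 = identical)."""
    return 1.0 - np.abs(extract_features(candidate) - extract_features(glyph)).mean()

def recognize(candidate, reference_glyphs):
    """Pick the reference glyph (dict: character -> bitmap) with the best combined score."""
    return max(reference_glyphs,
               key=lambda ch: matrix_match_score(candidate, reference_glyphs[ch])
                              + feature_match_score(candidate, reference_glyphs[ch]))
```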
The first region of the image is identified retrospectively, after character recognition has been performed on the entire image, as a result of the analysis of the first group of characters. Alternatively, the first region can be identified before or while character recognition is performed. The first recognition unit 121 is fast and reliable when analyzing plain text which has not been obscured.
The second character recognition unit 122 is configured to further recognize a second group of characters corresponding to a second region of the image. The second character recognition unit 122 performs additional recognition of the second group of characters using a neural network which has been trained to recognize a plurality of word strings. The second character recognition unit 122 uses the conventional techniques available to recognize text in an image using a neural network. There is a synergy between the recognition of the entire document by the first character recognition unit 121, followed by the additional recognition of the second region by the second character recognition unit 122, which has the technical effect that the computing resources are allocated to the correction of errors.
Word strings provide contextual information to the neural network, so that the second character recognition unit 122 is adapted to recognize words which are difficult to recognize in isolation. In addition, the neural network can be trained so that low quality images can be recognized accurately. The neural network is trained by inputting representations of the characters to be recognized. The training phase performs a gradient descent technique so that the neural network is optimized by reducing output errors. The machine-coded text output is based on a probability measure obtained by comparison with the text samples that were input during the training phase. The feedforward processing of the neural network is carried out so that there is convergence towards the probability measure. The neural network is used to adapt the second character recognition unit so that it can perform recognition of characters that were not encountered during the training of the neural network.
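The patent does not specify a network architecture, so the following is only a schematic sketch of the training just described: a single-layer softmax classifier over placeholder character bitmaps, optimized by gradient descent to reduce output errors. All sizes and data are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder training data: 100 flattened 16x16 character bitmaps, 26 classes.
X = rng.random((100, 256))
y = rng.integers(0, 26, size=100)

W = np.zeros((256, 26))
for _ in range(200):                       # training phase: gradient descent
    logits = X @ W
    logits -= logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    grad = probs.copy()
    grad[np.arange(len(y)), y] -= 1.0      # gradient of the cross-entropy loss
    W -= 0.1 * (X.T @ grad) / len(y)       # step that reduces output errors

# Inference: the machine-coded output is the class with the highest probability.
predicted = np.argmax(X @ W, axis=1)
```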
The second character recognition unit 122 provides better recognition of text which has been obscured, although it is less accurate than the first recognition unit 121 for recognition of plain text that has not been obscured. The second character recognition unit 122 improves accuracy when recognizing text in a low quality image. However, the operations of the second character recognition unit 122 are computationally intensive, which results in image processing being slower and consuming more processing resources.
Consequently, it is necessary to find a balance between the desired level of accuracy and the allocation of resources. To do this, image recognition is carried out on the entire document using the first character recognition unit 121 and, if it is determined that additional character recognition must be carried out, additional recognition is performed using the second character recognition unit 122.
The processor 130 operates as a measurement unit 131, a determination unit 132 and a selection unit 133. The measurement unit 131 is configured to calculate a confidence measure of the first group of characters. The determination unit 132 is configured to determine whether additional recognition should be made based on the confidence measure. The selection unit 133 is configured to select the second region of the image, the second region comprising the first region. Accordingly, processor 130 is configured to identify how to improve accuracy and allocate resources efficiently, using the character recognition provided by the plurality of character recognition units 120.
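To make the division of labour between the units concrete, here is a hedged structural sketch; the names mirror the reference numerals of FIG. 1 and the step comments anticipate FIG. 2, but every interface (including the group.region and group.characters attributes) is an assumption introduced for illustration:

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Pipeline:
    first_ocr: Callable      # unit 121: image -> list of recognized groups
    confidence: Callable     # unit 131: group -> confidence measure
    needs_more: Callable     # unit 132: confidence measure -> decision
    select_region: Callable  # unit 133: (image, first region) -> wider second region
    second_ocr: Callable     # unit 122: (image, second region) -> characters

    def run(self, image) -> str:
        text: List[str] = []
        for group in self.first_ocr(image):                       # step S210
            if self.needs_more(self.confidence(group)):           # steps S220, S230
                region = self.select_region(image, group.region)  # step S240
                text.append(self.second_ocr(image, region))       # step S250
            else:
                text.append(group.characters)
        return " ".join(text)
```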
The analysis by the first character recognition unit 121 over the entire image results in a machine-coded character string corresponding to all the text which has been identified in the image. The measurement unit 131 provides a confidence value, so that the determination unit 132 can determine whether the machine-coded character string contains errors. Error identification can be used retrospectively by the selection unit 133 to identify a first region of the image for which additional recognition must be carried out. Once it has been determined that further recognition is to be made, the selection unit 133 identifies a second region of the image which includes the first region, thereby providing additional information which will be useful in further evaluating the first region.
As an alternative, the first region could be selected before the first character recognition unit 121 performs character recognition on the entire document. This allows the first region to be determined in advance as part of the image for which the first group of characters is to be checked. This makes it possible to give priority to certain parts of the image, for example, if the user has identified that this part of the image is particularly important, or if it has been determined that the first region of the image is low quality.
Input 101 and output 102 are configured to receive and transmit electronic data. The input 101 is configured to receive the image to be analyzed, for example from a local network, from the Internet or from an external memory. In addition, input 101 is configured to receive instructions from a user via, for example, a mouse or keyboard. Output 102 is configured to output the text that has been identified. The output 102 includes a display allowing the text to be identified to the user. Output 102 includes a network connection for communicating over the Internet.
The features of the image processing device 100 can be organized differently. For example, each of the character recognition units 120 may include a processor 130 which is configured to serve as a measurement unit 131, a determination unit 132 and a selection unit 133. The plurality of character recognition units 120 can be part of the same device or alternatively be distributed as a system over a plurality of devices.
The image processing device 100 can be part of a personal computer. Alternatively, the image processing device 100 can be part of a multifunction device, further comprising a scanner, a copier, a fax machine and a printer.
FIG. 2 is a flowchart illustrating an image processing method S200 for identifying the text included in an image 300. The image processing method S200 is implemented by the image processing system 100. A program, when executed by the image processing system 100, causes the image processing system to carry out the image processing method S200. A computer-readable medium stores the program.
In step S210, the first character recognition unit 121 performs the function of recognizing a first group of characters 111 corresponding to a first region of the image 300.
The first character recognition unit 121 performs over-segmentation on the image, which identifies the characters in the image. The image is segmented into pieces, then each piece is recognized. The pieces are put together and contextual information is used to make a decision in ambiguous cases. The over-segmentation identifies the words of the document, each word comprising a group of characters. The over-segmentation also identifies the lines of text included in the document, each line of text comprising a group of words. Words and lines can be used to provide context for recognizing characters in the image.
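As a hedged illustration of one common segmentation approach (the patent does not mandate this particular method), a binary text-line image can be split into pieces at blank columns:

```python
import numpy as np

def segment_columns(binary_image):
    """Split a binary text-line image (1 = ink) into pieces at blank columns."""
    ink_per_column = binary_image.sum(axis=0)  # count of ink pixels per column
    pieces, start = [], None
    for x, ink in enumerate(ink_per_column):
        if ink > 0 and start is None:
            start = x                          # a piece begins
        elif ink == 0 and start is not None:
            pieces.append((start, x))          # a piece ends at a blank column
            start = None
    if start is not None:
        pieces.append((start, binary_image.shape[1]))
    return pieces                              # list of (start, end) column ranges
```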
The first character recognition unit 121 performs character recognition for the entire document, so that all text in the image will be analyzed. Advantageously, this is fast and provides a first technique for identifying the text in the document.
Over-segmentation is used to identify the words and characters of the text extracted from the image. A first group of characters corresponds to a word identified in the text. The first group of characters is a subset of the text extracted from the image. The first region is the part of the image that includes the first group of characters. Text accuracy can be improved by identifying the first regions for which character recognition by the first character recognition unit 121 is of poor quality.
In some cases, the text includes a plurality of groups of characters for which the recognition of characters by the first character recognition unit 121 is of low quality. In this case, a plurality of first regions of the image are identified, each of the first regions corresponding to a different first group of characters. Advantageously, the accuracy of the text can be improved by identifying a plurality of errors which must be corrected.
Each first region is associated with the corresponding first group of characters which has been recognized. It is therefore possible to map the input image to the output text. The association of the first region with the first group of characters is useful if the accuracy of the first group of characters needs to be examined in more detail by performing character recognition again for the first group of characters. In addition, it is useful to have a mapping between the input image and the output text when adding a layer to the image to provide selectable machine-readable text that overlays the original image of the document.
In step S220, the measurement unit 131 performs the function of calculating a confidence measurement of the first group of characters 111.
The confidence measure identifies the confidence level for each of the characters which is detected by a character recognition unit 120. Advantageously, the confidence level makes it possible to identify and eliminate errors in the text output by the first character recognition unit 121.
Errors generally occur if the image includes a style that has never been encountered before, such as a different font or text that has been obscured. There may be punctuation recognition errors, which make character recognition difficult. In addition, image defects can hide the text. The quality of the image influences the errors encountered when recognizing the text, because low quality introduces ambiguities. It is difficult to recognize characters if there are not enough pixels, because a low resolution reduces the accuracy of the mapping onto the set of characters stored in memory. It is particularly difficult to identify text of small height, because it results in characters that have a small number of pixels.
A low confidence measure indicates that the recognition by the character recognition unit 120 includes errors. Various techniques are available to identify errors (a combined sketch follows this list), for example:
- assign a weighting, W, to each character which identifies a probability that the recognized character precisely represents the character identified in the image;
- assign an average weighting (W) to each word which represents an average weighting for all the characters of the word;
- assign a maximum weighting (W) to each word which represents the maximum weighting for a particular character of the word;
- assign a weighting to each line which represents an average weighting or a maximum weighting for all the characters of the line;
- perform a spell check to determine if the detected words are included in a dictionary;
- determine if the words detected contain inconsistent characteristics, such as the presence of punctuation;
- compare the different words that have been recognized to assess whether they have an appropriate context, such as checking grammar;
- determine the number of pixels that make up the character in the image, as this indicates the resolution of the first region that was used to obtain the first group of characters;
- determine the height of the characters in the image, because a low character height results in a low number of pixels making up the character; and
- any combination of the above techniques, such as the combination of the average (W) and maximum (W) measurements.
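The sketch below combines the average-weighting and worst-character criteria from the list above; the per-character probabilities and both thresholds are assumptions chosen for illustration:

```python
def word_confidence(char_weights):
    """char_weights: per-character probabilities W from a recognition unit 120."""
    average = sum(char_weights) / len(char_weights)  # average weighting over the word
    worst = min(char_weights)                        # weakest single character
    return average, worst

def needs_further_recognition(char_weights, avg_threshold=0.80, worst_threshold=0.50):
    """Flag a word whose average confidence is low or which has one very uncertain character."""
    average, worst = word_confidence(char_weights)
    return average < avg_threshold or worst < worst_threshold

# A word such as "M4K35", with uncertain characters where digits replaced letters:
print(needs_further_recognition([0.95, 0.40, 0.92, 0.55, 0.60]))  # True
```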
In addition to the association between the first region and the first group of characters, these two elements are also associated with the confidence measure. For the situation in which a plurality of first regions have been identified corresponding to a plurality of first groups of characters, a plurality of confidence measures is calculated. It is possible that the identification of the first region is done retrospectively, once the first group of characters is identified as having a low confidence value.
In step S230, the determination unit 132 performs the function of determining whether additional recognition must be carried out on the basis of the confidence measure. If the confidence measure is low, this indicates that the first group of characters may contain an error. Therefore, if the confidence measure is below a threshold, additional processing must be performed. In the case where a plurality of first groups of characters are identified as having a weak confidence measure, computing resources are allocated to the additional recognition of the weakest confidence measures, by selecting a maximum number of first groups of characters for which additional recognition must be performed.
The confidence measure corresponds to the first group of characters. Thus, the confidence measure corresponds to the first region. A mapping between the machine-coded text and the image could occur after the confidence measure is calculated, so that the first group of characters is associated with the first region. Alternatively, the mapping between the first region and the first group of characters could be established before calculating the confidence measure.
If no further recognition is necessary, the method S200 ends, which corresponds to the situation in which no error has been identified in the machine-coded text output by the first recognition unit 121. However, if additional recognition is to be performed, the method S200 goes to step S240. In the case where a plurality of first regions have been identified, the method S200 goes to step S240 for the first regions for which it is determined that additional recognition must be carried out. Therefore, if additional recognition is not required, this saves resources and accelerates the performance of character recognition.
The determination of the need for additional recognition is based on the confidence measure. If the confidence measure is less than a threshold value, this indicates that the quality of the first character recognition is low and that additional recognition must therefore be carried out. In particular, account is taken of the weighting values which constitute the confidence measure. In addition, it is possible to take into account the number of pixels making up the characters, for example by determining the height of the characters.
For the situation in which a plurality of first regions has been identified, the first regions are sorted on the basis of the confidence measure. Advantageously, the first regions which most need additional recognition are given priority in the allocation of resources. The amount of processing available for further recognition is limited, and therefore a maximum number of first regions can be analyzed in more detail. This maximum number can be selected by the user, determined based on the size of the image document, or determined by evaluating the plurality of confidence measures that have been calculated. Alternatively, sorting the plurality of first regions allows additional recognition to be performed until the available resources have been exhausted, for example if there is a limited amount of processing available for additional recognition, or a timer indicating that no additional time is available for processing the additional recognition.
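A hedged sketch of this prioritisation; the cap, the optional time budget and the data layout are assumptions:

```python
import heapq
import time

def select_for_further_recognition(regions, confidences, max_count, time_budget_s=None):
    """Pick up to max_count regions, lowest confidence first, within an optional time budget.

    regions and confidences are parallel lists; a time budget, if given, stops the
    additional recognition early once the available resources are exhausted.
    """
    worst_first = heapq.nsmallest(max_count, zip(confidences, range(len(regions))))
    deadline = None if time_budget_s is None else time.monotonic() + time_budget_s
    selected = []
    for _, index in worst_first:
        if deadline is not None and time.monotonic() > deadline:
            break  # no more time available for processing the additional recognition
        selected.append(regions[index])
    return selected
```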
In step S240, the selection unit 133 performs the function of selecting a second region of the image 300 which includes the first region, if it is determined that additional recognition must be carried out for the first region.
The first region corresponds to a group of characters forming one or more words. The second region includes the first region, since an additional recognition step must be performed for this first region. However, the second region is larger than the first region, because the second region includes parts of the image that will provide context for the first region. The second region includes information additional to the first region (see the sketch after this list), such as:
- words which are adjacent to the first region;
- the entire line of text that includes the first region; and
- the parts of the image that have been identified as providing context for the first region.
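As a sketch of this selection step, regions are assumed here to be (x0, y0, x1, y1) word bounding boxes on one text line; this representation is an assumption for illustration:

```python
def union(a, b):
    """Smallest box (x0, y0, x1, y1) containing boxes a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))

def select_second_region(first_box, line_word_boxes, n_adjacent=2):
    """Grow the first region to include up to n_adjacent words on each side.

    Passing n_adjacent=len(line_word_boxes) selects the entire line of text.
    Missing neighbours (first or last word of the line) are skipped naturally.
    """
    i = line_word_boxes.index(first_box)
    neighbours = line_word_boxes[max(0, i - n_adjacent): i + n_adjacent + 1]
    second_box = neighbours[0]
    for box in neighbours[1:]:
        second_box = union(second_box, box)
    return second_box
```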
In step S250, the second character recognition unit performs the function of additional recognition of a second group of characters 222 corresponding to the second region of the image 300.
The second region is a subset of the image. Thus, while the first character recognition unit 121 performs character recognition on the entire document, the second character recognition unit 122 performs character recognition on a much smaller part of the image. Therefore, the second character recognition unit 122 is focused on the second region, which has been identified as including an error in the first region. In addition, the second character recognition unit 122 uses the additional information which is identified as providing context to the first region.
The output of the second character recognition unit 122 should be more accurate than that of the first character recognition unit 121. Consequently, the corresponding part of the text which is output by the first character recognition unit 121 is replaced by the output of the second character recognition unit 122. Advantageously, the accuracy of character recognition is improved by the use of a plurality of character recognition units 120 which are adapted to the image analyzed, while balancing the allocation of computing resources.
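A small sketch of this replacement step; the word-indexed representation is an assumption, and the example line anticipates FIGs. 4A-C below:

```python
def splice(first_pass_words, corrections):
    """Replace first-pass words with the second character recognition unit's output.

    corrections maps a word index to the characters recognized for the second
    region covering that word.
    """
    return " ".join(corrections.get(i, w) for i, w in enumerate(first_pass_words))

words = ["EXAMPLE", "OF", "A", "LINE", "WHERE", "CONTEXT", "M4K35", "A", "DIFFERENCE."]
print(splice(words, {6: "MAKES"}))  # EXAMPLE OF A LINE WHERE CONTEXT MAKES A DIFFERENCE.
```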
As an alternative, the plurality of character recognition units 120 may include other character recognition units which are specialized for correcting errors in character recognition. The second character recognition unit 122 is adapted to perform character recognition for a specific type of image, such as a low quality scan. Thus, the second character recognition unit 122 is selected based on the fact that the second region is identified as being of low quality. Consequently, the image processing method S200 is performed for the image using the appropriate character recognition unit 120. Advantageously, the most suitable second character recognition unit 122 is selected to carry out the additional recognition.
FIG. 3A is a diagram illustrating how the image processing method S200 identifies the text included in the image 300.
In step S210, character recognition is carried out by the first character recognition unit 121 on the first region 1, thus obtaining the first group of characters 111. Steps S220-S240 are carried out to determine whether additional recognition of the first region 1 must be performed.
In step S250, the character recognition is carried out by the second character recognition unit 122 on the second region 2, thus obtaining the second group of characters 222.
FIG. 3A illustrates the second region 2 corresponding to a line of text. A line of text is selected because it is considered to be capable of providing a context for the analysis of the first region 1. Advantageously, the second character recognition unit 122 is adapted to analyze low quality images and, therefore, the second group of characters should have a higher confidence measure than the low confidence measure which was determined when carrying out the recognition of characters from the first region 1 with the first character recognition unit 121.
FIG. 3B is a diagram illustrating how the image processing method S200 identifies the text included in the image 300.
In step S210, character recognition is carried out by the first character recognition unit 121 on the plurality of first regions 1A-C, thus obtaining the plurality of first groups of characters 111A-C. Optionally, the first character recognition unit 121 is configured to analyze the entire document, although alternatively it can be configured to analyze part of the document. Steps S220-S240 are performed to determine whether additional recognition is to be performed for each of the first regions 1A-C.
In step S250, character recognition is carried out by the second character recognition unit 122 on the plurality of second regions 2A-C, thereby obtaining the plurality of second groups of characters 222A-C.
FIG. 3B illustrates the second regions 2A-C corresponding to the words which are adjacent to the first regions 1A-C. One or more words adjacent to the first region 1A-C can be used. The number of words to be included in the second region 2A-C is specified in advance. Alternatively, the number of words can be determined by establishing whether there are enough words to provide context. If the first region 1A is a first word, there will be no words before the first region 1A, so the second region 2A will be composed of adjacent words which appear after the first region 1A. Likewise, if the first region is a last word, there will be no words after the first region 1A, so that the second region 2A will be composed of adjacent words which appear before the first region 1A.
The FIGs. 3A and 3B illustrate examples of the image processing method S200 performed by the image processing system 100, which can be provided separately or in combination. The selection unit 133 accordingly selects the second region according to settings which have been chosen in advance, specifying whether a line should be selected as in FIG. 3A or whether adjacent words should be selected as in FIG. 3B. It is also possible to assess whether the second region provides the context to be used by the second character recognition unit 122.
The FIGs. 4A-C and 5A-C illustrate examples of how the first region 111 and the second region 222 can be identified. As an alternative, the examples in FIGs. 4A-C and 5A-C can be provided in combination to identify the first region 111 and the second region 222. In addition, FIGS. 4A-C and 5A-C serve to illustrate how the context provided by the second region 222 can be used to perform the recognition of the first region 111.
The FIGs. 4A-C provide an example of character recognition which uses context, for which the confidence value associated with a first group of characters 41 is used to determine that additional recognition must be carried out for the first region 111 by the second character recognition unit 122.
FIG. 4A provides an example of a first group of characters 41 which is determined to contain an error based on the confidence measure.
[00101] The context is very important when reading a line of text. For example, what do you read here:
M4K35
The measurement unit 131 calculates a confidence value, which is low because the first group of characters 41 includes both letters and numbers. Consequently, the determination unit 132 establishes that additional recognition must be carried out.
The first group of characters 41 corresponds to the first region 111.
[00104] FIG. 4B gives an example of a group of characters 42 which includes the first group of characters 41.
Try to read this line:
EXAMPLE OF A LINE WHERE CONTEXT M4K35 A DIFFERENCE.
The selection unit 133 identifies the characters 42 output by the first character recognition unit 121 which are candidates for providing context to the first group of characters 41.
The context can be active or passive. As a first example of passive context, the characters 42 can be identified as being on the same line of text as the first group of characters 41. As a second example of passive context, the characters 42 can be identified as being words adjacent to the first group of characters 41. As an example of active context, a context measure can positively identify that the group of characters 42 will provide context to the first group of characters 41.
The selection unit 133 uses the group of characters 42 to identify the second region 222 which will be useful for providing additional recognition.
[00109] FIG. 4C provides an example of a second group of characters 43 for which the errors have been corrected.
The second group of characters 43 is output by the second character recognition unit 122 which performs the recognition of characters from the second region 222 of the image.
[00111] Consequently, the text is corrected as follows:
EXAMPLE OF A LINE WHERE CONTEXT MAKES A DIFFERENCE.
For the second group of characters 43 output by the second character recognition unit 122, the measurement unit 131 calculates a confidence level higher than that of the group of characters 42 output by the first character recognition unit 121.
The errors introduced by the first character recognition unit 121 have been corrected by the second character recognition unit 122. Consequently, the characters 42 which were output by the first character recognition unit 121 are replaced by the second group of characters 43 which were output by the second character recognition unit 122.
The FIGs. 5A-C provide another example of character recognition which uses context, for which the confidence value associated with a first region 51 is used to determine that the additional recognition must be carried out by the second character recognition unit 122.
[00115] FIG. 5A provides an example of a first region 51 detected by the image processing system 100.
The first character recognition unit 121 performs character recognition on the complete image. Consider the situation in which a first group of characters 111 is identified comprising two or three characters which form a single word. In addition, the first group of characters 111 is recognized as the machine-coded characters LO. The first group of characters 111 is associated with the first region 51. The measurement unit 131 calculates a low confidence value, which can be explained as follows:
- the number of pixels in region 51 is small;
- the pixels do not map precisely onto any of the machine-coded characters stored by the first character recognition unit 121; and
- spell checking of the word LO indicates that there is likely to be an error.
It is difficult to visually identify the letters corresponding to the image shown in FIG. 5A, because the image quality is poor and there is no context to determine the meaning of the pixels detected.
[00118] FIG. 5B provides an example of a second region 52 selected by the image processing system 100.
The first region 51 is included in the second region 52. The second region 52 provides the context for the first region 51 by including some of the words that are adjacent to the first region 51.
[00120] The second character recognition unit 122 performs recognition on the second region 52, which gives the second group of characters:
describes in greater detail,
[00121] FIG. 5C provides an example of a text line 53. The first region 51 of line 53 corresponds to the first region 51 shown in FIG. 5A. The second region 52 of line 53 corresponds to the second region 52 shown in FIG. 5B.
The context provided to the first region 51 by the adjacent words which are included in the second region 52 results in an increased confidence measure. Consequently, the second group of characters 222 replaces the corresponding characters which were recognized by the first character recognition unit 121.
Consequently, the line of text 53 is recognized as follows:
The next section describes in greater detail,
The above examples can also be carried out by a computer of a system or device (or devices such as a CPU or MPU) which reads and executes a program recorded on a memory device to carry out the functions of the examples described above, and by a method whose steps are carried out by a computer of a system or device, for example, by reading and executing a program recorded on a memory device to carry out the functions of the examples described above. To this end, the program is supplied to the computer, for example, via a network or from a recording medium of various types serving as a memory device (for example, a computer-readable medium such as a non-transitory computer-readable medium).
[00125] Although the present invention has been described with reference to embodiments, it should be understood that the invention is not limited to the embodiments disclosed. The present invention can be implemented in various forms without departing from the main features of the present invention. The scope of the following claims should be interpreted in the widest sense to encompass all of these modifications and the equivalent structures and functions.
Claims:
Claims (18)
[1]
1. Image processing method for recognizing characters included in an image, the image processing method comprising the steps consisting in:
performing recognition in a first region of the image, by a first character recognition unit, of a first group of characters corresponding to the first region of the image;
calculating a confidence measure of the first group of characters;
determining whether additional recognition should be made on the basis of the confidence measure;
selecting a second region of the image that includes the first region, if it is determined that further recognition is to be performed; and carrying out an additional recognition in the second region of the image, by a second character recognition unit, of a second group of characters corresponding to the second region of the image.
[2]
2. Method according to the preceding claim, further comprising the steps consisting in:
performing recognition in a plurality of the first regions of the image, by the first character recognition unit, of a plurality of first groups of characters corresponding to the plurality of first regions of the image;
calculating a confidence measure for each of the first groups of characters;
determining whether an additional recognition should be carried out for each of the first groups of characters, on the basis of the corresponding confidence measure;
selecting a plurality of second regions of the image which each include the corresponding first region, if it is determined that further recognition is to be performed; and
performing the additional recognition in the plurality of second regions of the image, by the second character recognition unit, of a plurality of second groups of characters corresponding to the plurality of second regions of the image.
[3]
The method of claim 2, wherein the step of determining whether further recognition by the second character recognition unit is to be performed comprises selecting a maximum number of first groups of characters, based on the confidence measure for each of the first groups of characters, wherein the first groups of characters correspond to a maximum number of confidence measures which are the lowest.
[4]
4. Method according to any one of the preceding claims, in which the recognition of the first group of characters comprises at least one of:
a matrix match, in which the first region is compared to a glyph; and a feature extraction, in which the first region is compared to a plurality of features of a glyph.
[5]
5. Method according to any one of the preceding claims, in which the confidence measure is based on at least one of:
an average weighting for all the characters of the first group of characters; and a maximum weighting for all the characters of the first group of characters.
[6]
6. Method according to any one of the preceding claims, in which it is determined that an additional recognition must be carried out if the confidence measurement is less than a threshold value.
[7]
The method according to any of the preceding claims, wherein it is determined that further recognition is to be made if the first group of characters matches the text in the first region which is identified as having at least one of:
a number of pixels less than a threshold value; and a height less than a threshold value.
[8]
8. Method according to any one of the preceding claims, in which the additional recognition of the second group of characters is adapted for at least one element from:
the second region of the image; and a low quality image.
[9]
The method according to any of the preceding claims, wherein the further recognition of the second group of characters uses a neural network which has been trained to recognize a plurality of word strings.
[10]
The method of any one of the preceding claims, wherein the second region further comprises words which are identified as being adjacent to the first region.
[11]
The method of any one of the preceding claims, wherein the second region further comprises words which are identified as being on the same line of text as the first region.
[12]
The method of any one of the preceding claims, wherein the second region further comprises words which are identified as providing context to the first region.
[13]
13. An image processing system for recognizing characters included in an image, the image processing system comprising:
a first character recognition unit configured to perform recognition in a first region of the image of a first group of characters corresponding to the first region of the image;
a measurement unit configured to calculate a confidence measure of the first group of characters;
a determination unit configured to determine whether further recognition is to be made based on the confidence measure;
a selection unit configured to select a second region of the image which includes the first region, if it is determined that further recognition is to be performed; and a second character recognition unit configured to perform additional recognition in the second region of the image of a second group of characters corresponding to the second region of the image.
[14]
The image processing system according to claim 13, in which the first character recognition unit is configured to perform at least one of:
a matrix match, in which the first region is compared to a glyph; and a feature extraction, in which the first region is compared to a plurality of features of a glyph.
[15]
15. An image processing system according to claim 13 or claim 14, in which the second character recognition unit performs an additional recognition of the second group of characters which is adapted for at least one of:
the second region of the image; and a region of an image that is of low quality.
[16]
The image processing system according to any of claims 13 to 15, wherein the second character recognition unit performs further recognition of the second group of characters using a neural network which has been trained to
recognize a plurality of word strings.
[17]
17. A program which, when implemented by an image processing system, causes the image processing system to carry out a method according to any one of claims 1 to 12.
[18]
18. A computer-readable medium which stores a program according to claim 17.
Similar technologies:
Publication number | Publication date | Patent title
FR2963695A1|2012-02-10|FONT WEIGHT LEARNING FOR TEST SAMPLES IN HANDWRITTEN KEYWORD SPOTTING
BE1024194A9|2017-12-19|Method for identifying a character in a digital image
BE1025503B1|2019-03-27|LINE SEGMENTATION METHOD
JP5624004B2|2014-11-12|Method for binarizing a scanned document image containing gray or light color text printed with a halftone pattern
US20050259866A1|2005-11-24|Low resolution OCR for camera acquired documents
BE1022562A9|2017-02-07|Optical character recognition method
Shang et al.2014|Detecting documents forged by printing and copying
US9596378B2|2017-03-14|Method and apparatus for authenticating printed documents that contains both dark and halftone text
US10025976B1|2018-07-17|Data normalization for handwriting recognition
RU2581786C1|2016-04-20|Determination of image transformations to increase quality of optical character recognition
BE1026039B1|2019-12-13|IMAGE PROCESSING METHOD AND IMAGE PROCESSING SYSTEM
BE1026159B1|2020-05-08|IMAGE PROCESSING SYSTEM AND IMAGE PROCESSING METHOD
US6266445B1|2001-07-24|Classification-driven thresholding of a normalized grayscale image
US8773733B2|2014-07-08|Image capture device for extracting textual information
WO2013177240A1|2013-11-28|Textual information extraction method using multiple images
RU2597163C2|2016-09-10|Comparing documents using reliable source
JP7038988B2|2022-03-22|Image processing method and image processing system
US20210248402A1|2021-08-12|Information processing apparatus and non-transitory computer readable medium storing program
US20210248411A1|2021-08-12|Information processing apparatus and non-transitory computer readable medium storing program
BE1020588A5|2014-01-07|METHOD OF PATTERN RECOGNITION, COMPUTER PROGRAM PRODUCT, AND MOBILE TERMINAL.
Sharma et al.2021|Script Identification for Devanagari and Gurumukhi using OCR
CN110991303A|2020-04-10|Method and device for positioning text in image and electronic equipment
CN114020863A|2022-02-08|Visual question-answer analysis method, device and system and readable storage medium
Kumar et al.2018|Text Attentional Character Detection Using Morphological Operations: A Survey
BE1025134A1|2018-11-09|Method of identifying a character in a digital image
Patent family:
Publication number | Publication date
US20190266447A1|2019-08-29|
EP3759647A1|2021-01-06|
KR20200128089A|2020-11-11|
GB201803262D0|2018-04-11|
GB2571530B|2020-09-23|
CN111630521A|2020-09-04|
JP2021502628A|2021-01-28|
GB2571530A|2019-09-04|
WO2019166301A1|2019-09-06|
US11170265B2|2021-11-09|
BE1026039A1|2019-09-18|
Cited documents:
Publication number | Filing date | Publication date | Applicant | Patent title
US20040037470A1|2002-08-23|2004-02-26|Simske Steven J.|Systems and methods for processing text-based electronic documents|
US20140212039A1|2013-01-28|2014-07-31|International Business Machines Corporation|Efficient Verification or Disambiguation of Character Recognition Results|
JPS5635276A|1979-08-30|1981-04-07|Toshiba Corp|Rejected character processing system for optical character reading device|
US5251268A|1991-08-09|1993-10-05|Electric Power Research Institute, Inc.|Integrated method and apparatus for character and symbol recognition|
US5442715A|1992-04-06|1995-08-15|Eastman Kodak Company|Method and apparatus for cursive script recognition|
JP3260979B2|1994-07-15|2002-02-25|株式会社リコー|Character recognition method|
US5835633A|1995-11-20|1998-11-10|International Business Machines Corporation|Concurrent two-stage multi-network optical character recognition system|
US6104833A|1996-01-09|2000-08-15|Fujitsu Limited|Pattern recognizing apparatus and method|
US5933531A|1996-08-23|1999-08-03|International Business Machines Corporation|Verification and correction method and system for optical character recognition|
US8428332B1|2001-09-27|2013-04-23|Cummins-Allison Corp.|Apparatus and system for imaging currency bills and financial documents and method for using the same|
US8929640B1|2009-04-15|2015-01-06|Cummins-Allison Corp.|Apparatus and system for imaging currency bills and financial documents and method for using the same|
US8391583B1|2009-04-15|2013-03-05|Cummins-Allison Corp.|Apparatus and system for imaging currency bills and financial documents and method for using the same|
JP3919617B2|2002-07-09|2007-05-30|キヤノン株式会社|Character recognition device, character recognition method, program, and storage medium|
DE50305344D1|2003-01-29|2006-11-23|Harman Becker Automotive Sys|Method and apparatus for restricting the scope of search in a dictionary for speech recognition|
US8868555B2|2006-07-31|2014-10-21|Ricoh Co., Ltd.|Computation of a recognizability score for image retrieval|
US8156116B2|2006-07-31|2012-04-10|Ricoh Co., Ltd|Dynamic presentation of targeted information in a mixed media reality recognition system|
US9176984B2|2006-07-31|2015-11-03|Ricoh Co., Ltd|Mixed media reality retrieval of differentially-weighted links|
US8489987B2|2006-07-31|2013-07-16|Ricoh Co., Ltd.|Monitoring and analyzing creation and usage of visual content using image and hotspot interaction|
US8825682B2|2006-07-31|2014-09-02|Ricoh Co., Ltd.|Architecture for mixed media reality retrieval of locations and registration of images|
US8510283B2|2006-07-31|2013-08-13|Ricoh Co., Ltd.|Automatic adaption of an image recognition system to image capture devices|
US20080008383A1|2006-07-07|2008-01-10|Lockheed Martin Corporation|Detection and identification of postal metermarks|
US8417017B1|2007-03-09|2013-04-09|Cummins-Allison Corp.|Apparatus and system for imaging currency bills and financial documents and method for using the same|
GB2466597B|2007-09-20|2013-02-20|Kyos Systems Inc|Method and apparatus for editing large quantities of data extracted from documents|
US8755604B1|2008-06-05|2014-06-17|CVISION Technologies, Inc.|Using shape similarity methods to improve OCR speed and accuracy|
JP2010217996A|2009-03-13|2010-09-30|Omron Corp|Character recognition device, character recognition program, and character recognition method|
US8620078B1|2009-07-14|2013-12-31|Matrox Electronic Systems, Ltd.|Determining a class associated with an image|
JP4985724B2|2009-07-30|2012-07-25|富士通株式会社|Word recognition program, word recognition method, and word recognition device|
US8401293B2|2010-05-03|2013-03-19|Microsoft Corporation|Word recognition of text undergoing an OCR process|
US9141877B2|2012-01-25|2015-09-22|The United States Of America As Represented By The Secretary Of The Air Force|Method for context aware text recognition|
US9262699B2|2012-07-19|2016-02-16|Qualcomm Incorporated|Method of handling complex variants of words through prefix-tree based decoding for Devanagiri OCR|
US8947745B2|2013-07-03|2015-02-03|Symbol Technologies, Inc.|Apparatus and method for scanning and decoding information in an identified location in a document|
US9171224B2|2013-07-04|2015-10-27|Qualcomm Incorporated|Method of improving contrast for text extraction and recognition applications|
US20150227787A1|2014-02-12|2015-08-13|Bank Of America Corporation|Photograph billpay tagging|
US9798943B2|2014-06-09|2017-10-24|I.R.I.S.|Optical character recognition method|
US9740928B2|2014-08-29|2017-08-22|Ancestry.Com Operations Inc.|System and method for transcribing handwritten records using word groupings based on feature vectors|
US10135999B2|2016-10-18|2018-11-20|Conduent Business Services, Llc|Method and system for digitization of document|
JP2018112839A|2017-01-10|2018-07-19|富士通株式会社|Image processing program, image recognition program, image processing device, image recognition device, image recognition method, and image processing method|
US10885531B2|2018-01-29|2021-01-05|Accenture Global Solutions Limited|Artificial intelligence counterfeit detection|
US20200311411A1|2019-03-28|2020-10-01|Konica Minolta Laboratory U.S.A., Inc.|Method for text matching and correction|
Legal status:
2020-01-23|FG|Patent granted|Effective date: 2019-12-13|
Priority:
Application number | Filing date | Patent title
GB1803262.3A|GB2571530B|2018-02-28|2018-02-28|An image processing method and an image processing system|